We consider an occupancy scheme in which "balls" are identified
with $n$ points sampled from the standard exponential distribution,
while the role of "boxes" is played by the spacings induced by an
independent random walk with positive and nonlattice steps. We discuss
the asymptotic behavior of five quantities: the index $K_n^*$ of the
last occupied box, the number $K_n$ of occupied boxes, the number
$K_{n,0}$ of empty boxes whose index is at most $K_n^*$, the index $W_n$
of the first empty box and the number of balls $Z_n$ in the last occupied
box. It is shown that the limiting distribution of properly scaled and
centered $K_n^*$ coincides with that of the number of renewals not
exceeding $\log n$. A similar result is shown for $K_n$ and $W_n$ under a
side condition that prevents occurrence of very small boxes. The condition
also ensures that $K_{n,0}$ converges in distribution. Limiting results for
$Z_n$ are established under an assumption of regular variation.

1. Introduction. The Bernoulli sieve is a simple recursive allocation of $n$
"balls" in infinitely many "boxes" indexed 1, 2, .... Let $\xi_1, \xi_2, \ldots$
be random values sampled independently from a given probability distribution
on (0, 1). At the first step each of $n$ balls is dropped in box 1 with
probability $\xi_1$. At the second step each of the remaining balls is dropped
in box 2 with probability $\xi_2$, and so on. The procedure is iterated until
all $n$ balls get allocated. It is easy to see that the probability that a
particular ball lands in box $j$ is equal to

(1) $P_j = \bar\xi_1 \cdots \bar\xi_{j-1}\,\xi_j$, $j \in \mathbb{N}$,

here and hereafter $\bar x := 1 - x$.
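
The allocation rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `bernoulli_sieve` and `sample_xi` are hypothetical, and the beta(1, 2) law chosen for $\xi$ is an arbitrary example, not something prescribed by the text.

```python
import random


def bernoulli_sieve(n, sample_xi, rng):
    """Run the sieve for n balls and return the occupancy numbers of
    boxes 1, 2, ... up to and including the last occupied box."""
    parts = []
    remaining = n
    while remaining > 0:
        xi = sample_xi(rng)  # success probability of the current box
        # Each remaining ball lands in the current box with probability xi.
        hits = sum(1 for _ in range(remaining) if rng.random() < xi)
        parts.append(hits)
        remaining -= hits
    return parts


# Assumed example law: xi ~ beta(1, 2).
rng = random.Random(2009)
parts = bernoulli_sieve(20, lambda r: r.betavariate(1.0, 2.0), rng)
print(parts, sum(parts))
```

The returned list is the weak composition truncated after its last positive part; boxes with a zero entry are the empty boxes counted later.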
Random discrete probability distributions with frequencies $(P_j)$ of the
form (1) are called residual allocation or stick-breaking models [1, 4, 24].
For instance, in the most popular and analytically best tractable case $(P_j)$
follows the GEM (Griffiths-Engen-McCloskey) distribution, which appears
when the law of $\xi$ is beta$(1, \theta)$ with some $\theta > 0$.

Note that $(P_j)$ is a (nonrandom) geometric distribution if the law of $\xi$
is a Dirac mass $\delta_x$ at some point $x \in (0, 1)$; we shall exclude this
and some other cases by the assumption that the support of $\xi_1$ is not a
set like $\{1 - x^j : j \in \mathbb{N}_0 := \mathbb{N} \cup \{0\}\}$. See [12]
for a survey of results on sampling from nonrandom discrete distributions
with infinitely many positive masses.

A random combinatorial structure which captures the occupancy of boxes
is the weak composition $C_n^*$ comprised of nonnegative integer parts
summing up to $n$. We speak of weak composition meaning that zero parts are
allowed; for instance, the sequence (2,3,0,1,0,0,1,0,0,0,...) (padded by
infinitely many 0's) is a possible value of $C_7^*$. Two related structures
which contain less information are the composition $C_n$ of $n$ obtained by
discarding zero parts of $C_n^*$ and a partition of $n$ obtained by arranging
the parts in nonincreasing order [these are (2,3,1,1) and (3,2,1,1),
respectively, in the example]. In the GEM case, the law of the partition is
widely known as Ewens' sampling formula (ESF), and the law of the
composition is a size-biased version of the ESF; see [1, 24].

Functionals of $C_n^*$ studied in this paper are as follows:

$K_n$ the number of boxes occupied by at least one ball,
$K_n^*$ the largest index of an occupied box,
$K_{n,0} = K_n^* - K_n$ the number of empty boxes with index not exceeding $K_n^*$,
$W_n$ the index of the first empty box,
$Z_n$ the number of balls in the last occupied box.

For $r = 1, 2, \ldots, n$, we also denote by $K_{n,r}$ the number of boxes
occupied by exactly $r$ balls.
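
As a small worked illustration of these definitions, the following Python sketch evaluates the functionals on the example weak composition (2,3,0,1,0,0,1,0,0,0,...) used above. The function names are hypothetical, and the convention that $W_n$ equals the list length plus one when the list contains no zero is an assumption of the sketch (it corresponds to padding the composition with zeros).

```python
def sieve_functionals(parts):
    """Compute K_n, K_n^*, K_{n,0}, W_n and Z_n from a weak composition
    given as a list of occupancy numbers (trailing zeros are optional)."""
    occupied = [i for i, p in enumerate(parts, start=1) if p > 0]
    k_star = occupied[-1]            # index of the last occupied box
    k_n = len(occupied)              # number of occupied boxes
    k_n0 = k_star - k_n              # empty boxes with index below k_star
    # First empty box; if the list has no zero, it is the box after the list.
    w_n = next((i for i, p in enumerate(parts, start=1) if p == 0),
               len(parts) + 1)
    z_n = parts[k_star - 1]          # balls in the last occupied box
    return {"K": k_n, "K_star": k_star, "K0": k_n0, "W": w_n, "Z": z_n}


def count_boxes_with(parts, r):
    """K_{n,r} for r >= 1: the number of boxes holding exactly r balls."""
    return sum(1 for p in parts if p == r)


example = [2, 3, 0, 1, 0, 0, 1, 0, 0, 0]
result = sieve_functionals(example)
print(result, count_boxes_with(example, 1))
```

For the example, box 7 is the last occupied one, four boxes are occupied, three boxes below index 7 are empty, the first empty box is box 3, and the last occupied box holds one ball.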

In [10] it was observed that Kn can be studied by tools of the renewal 
theory, and it was shown that under certain moment conditions the distri- 
bution of Kn is asymptotically normal. The composition Cn in this case 
has some common features with logarithmic combinatorial structures [1]; in 
particular, Kn exhibits a logarithmic growth. 

Throughout, we shall also rely on the following alternative construction
of the $C_n^*$'s. Let $\{S_k : k \in \mathbb{N}_0\}$ be a zero-delayed random
walk with a step distributed like $(-\log \bar\xi_1)$. For $E_1, E_2, \ldots$
an independent random sample from the standard exponential distribution,
also independent of $(S_k)$, think of the event $E_j \in (S_{k-1}, S_k)$ as a
ball dropped in box $k$. A composition of $n$ is defined as the sequence of
occupancy numbers in the natural order of intervals $(S_{k-1}, S_k)$,
$k = 1, 2, \ldots$. In what follows we will often use
$E_{1,n} \le E_{2,n} \le \cdots \le E_{n,n}$, the order statistics of
$E_1, \ldots, E_n$.
The equivalence with the Bernoulli sieve construction is established via
the mapping $y \mapsto e^{-y}$, $y > 0$, which allows us to identify
$\bar\xi_1 \cdots \bar\xi_k$ (we tacitly assume that this equals 1 for
$k = 0$) with $\exp(-S_k)$, $k \in \mathbb{N}_0$, transforms $(S_{k-1}, S_k)$
into an interval of size $P_k$ and transforms $(E_1, \ldots, E_n)$ into a
uniform [0, 1] sample. By this transformation, the event
$E_j \in (S_{k-1}, S_k)$ occurs when the $j$th coordinate of the uniform
sample falls in the $k$th interval $(\exp(-S_k), \exp(-S_{k-1}))$. This works,
because a point sampled from the uniform [0, 1] distribution falls in the
$k$th interval with probability $P_k$.
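
The alternative construction can be illustrated by the following Python sketch, which drops exponential points into the spacings of the random walk. The names `occupancy_from_walk` and `sample_xi` are hypothetical, and the uniform law used for $\xi$ in the usage line is only an assumed example.

```python
import bisect
import math
import random


def occupancy_from_walk(n, sample_xi, rng):
    """Build the weak composition from n exponential points and the
    random walk S_k with steps -log(1 - xi_k)."""
    points = sorted(rng.expovariate(1.0) for _ in range(n))
    walk = []          # S_1, S_2, ... up to the first S_k > E_{n,n}
    s = 0.0
    while s <= points[-1]:
        s += -math.log(1.0 - sample_xi(rng))
        walk.append(s)
    parts = [0] * len(walk)
    for e in points:
        # bisect_right counts the S_k <= e, which is the 0-based box index.
        parts[bisect.bisect_right(walk, e)] += 1
    return parts


# Assumed example law: xi uniform on [0, 1).
rng = random.Random(7)
parts = occupancy_from_walk(20, lambda r: r.random(), rng)
print(parts)
```

The walk is generated only until it first exceeds the largest exponential point, so the length of the returned list is the index of the box containing $E_{n,n}$, that is, the index of the last occupied box.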

As $n$ varies, compositions $C_n^*$ satisfy the following consistency conditions:

(SC) Sampling consistency: if one of $n$ balls is chosen uniformly at random
and removed from the box it occupies, the resulting weak composition
has the same probability law as $C_{n-1}^*$.
(DP) Deletion property: if the first box is inspected and it turns out that it
contains $k$ balls, then deleting the first box^1 yields a weak composition
with the same probability law as $C_{n-k}^*$.

Condition (SC) follows from the independence of $(S_k)$ and $(E_j)$ and
exchangeability of $(E_j)$, and condition (DP) follows from the renewal
property of $(S_k)$ and the memoryless property of the exponential
distribution. Both conditions also hold for the associated compositions,
which means that the sequence $(C_n)$ is a regenerative composition
structure, as introduced in [14]. Note that $K_n$, $K_{n,r}$ are, in fact,
functionals of the partition structure [4, 24] which is obtained by
discarding the ordering of parts in the $C_n$'s.

The random walk $(S_k)$ can be viewed as the range of a compound Poisson
process, that is, a subordinator $\{T_s : s \ge 0\}$ whose Lévy measure is the
distribution of $(-\log\bar\xi)$. One obtains a larger class of composition
structures by considering a general zero-drift subordinator and using the
open gaps comprising the complement of its range in the role of boxes. It
is known that normal limits for the number of parts $K_n$ are typical for
regenerative composition structures whose Lévy measure is infinite and has
the right tail slowly varying at 0 [3, 17], although $K_n$ then exhibits
growth faster than logarithmic. The limits are no longer normal if the right
tail of the Lévy measure is regularly varying at 0 with positive index, as,
for example, is the case for stable subordinators [16].^2

In this paper we dwell on the case of the Bernoulli sieve and obtain
considerable extensions of the results of [10]. In particular, we derive an
exhaustive criterion for the existence of a limiting distribution of properly
normalized and centered $K_n^*$ (Theorem 2.1). Then, under a side condition,
we do the same for $K_n$ and $W_n$ (Theorem 2.3). Among other things, this
condition ensures our most delicate result, which states that $K_{n,0}$
converges in distribution directly, without centering or scaling
(Theorem 2.2). A similar result also holds for $K_{n,0} + K_{n,1}$
(Proposition 5.2). In the GEM case, the limiting law of $K_{n,0}$ is mixed
Poisson (Proposition 5.1). Asymptotic properties of $Z_n$ are revealed in
Theorem 2.4.

^1 In the example, the elimination transforms (2,3,0,1,0,0,1,0,0,0,...) to
(3,0,1,0,0,1,0,0,0,...).

^2 It should be noticed that in the case of infinite Lévy measure the closed
range of the subordinator is a Cantor set; thus, with positive probability
the set of empty boxes between $E_{j-1,n}$ and $E_{j,n}$ is infinite.

The rest of the paper is organized as follows. In Section 2 we formulate 
our principal results and give examples. In Section 3 we extend the idea 
of representing a regenerative composition via a Markov chain [14, 15] to 
the weak compositions. We also collect necessary distributional recursions. 
Other sections are devoted to study of particular functionals. The Appendix 
summarizes some known asymptotic results about the number of renewals. 

2. Main results. Consider the process that counts renewals

(2) $N_t := \inf\{k \ge 1 : S_k > t\}$, $t \ge 0$.

Our idea is to connect possible convergence in distribution of
$(N_{\log n} - b_n)/a_n$ to some nondegenerate and proper probability law with
the convergence of $(K_n^* - b_n)/a_n$, $(K_n - b_n)/a_n$,
$(K_n - K_{n,1} - b_n)/a_n$ and $(W_n - b_n)/a_n$ to the same law. The first
connection can be anticipated in view of the identity $K_n^* = N_{E_{n,n}}$ by
recalling the fact from extreme-value theory that $E_{n,n} - \log n$ has a
limiting distribution (of Gumbel type).
Introduce the moments

$\mu := E(-\log\bar\xi)$ and $\sigma^2 := \mathrm{var}(-\log\bar\xi)$,

which may be finite or infinite.

Theorem 2.1. The following assertions are equivalent:

(i) There exist constants $\{a_n, b_n : n \in \mathbb{N}\}$ with $a_n > 0$ and
$b_n \in \mathbb{R}$ such that, as $n \to \infty$, the variable
$(K_n^* - b_n)/a_n$ converges weakly to some nondegenerate and proper
distribution.

(ii) The distribution of $(-\log\bar\xi)$ either belongs to the domain of
attraction of a stable law, or the function $P\{-\log\bar\xi > x\}$ slowly
varies at $\infty$.

Furthermore, this limiting distribution is as follows:

(a) If $\sigma^2 < \infty$, then for $b_n = \mu^{-1}\log n$ and
$a_n = (\mu^{-3}\sigma^2\log n)^{1/2}$, the limiting distribution is standard
normal.

(b) If $\sigma^2 = \infty$ and

$\int_0^x y^2\, P\{-\log\bar\xi \in dy\} \sim L(x)$, $x \to \infty$,

for some $L$ slowly varying at $\infty$, then for $b_n = \mu^{-1}\log n$,
$a_n = \mu^{-3/2} c_{\lfloor \log n \rfloor}$ and $c_n$ any sequence satisfying
$\lim_{n\to\infty} nL(c_n)/c_n^2 = 1$, the limiting distribution is standard
normal.

(c) Assume that the relation

(3) $P\{\bar\xi \le x\} \sim (-\log x)^{-\alpha} L(-\log x)$ as $x \to 0$

holds with $L$ slowly varying at $\infty$ and $\alpha \in [1,2)$, and assume
that $\mu < \infty$ if $\alpha = 1$; then for $b_n = \mu^{-1}\log n$,
$a_n = \mu^{-(\alpha+1)/\alpha} c_{\lfloor \log n \rfloor}$ and $c_n$ any
sequence satisfying $\lim_{n\to\infty} nL(c_n)/c_n^\alpha = 1$, the limiting
distribution is $\alpha$-stable with characteristic function

$t \mapsto \exp\{-|t|^\alpha \Gamma(1-\alpha)(\cos(\pi\alpha/2) + i\sin(\pi\alpha/2)\,\mathrm{sgn}(t))\}$, $t \in \mathbb{R}$.

(d) Assume that $\mu = \infty$ and the relation (3) holds with $\alpha = 1$.
Let $c$ be any positive function satisfying
$\lim_{x\to\infty} xL(c(x))/c(x) = 1$ and set
$\psi(x) := x\int_{\exp(-c(x))}^1 P\{\bar\xi \le y\}/y\,dy$. Let $b$ be any
positive function satisfying $b(\psi(x)) \sim \psi(b(x)) \sim x$. Then, with
$b_n = b(\log n)$ and $a_n = b(\log n)\,c(b(\log n))/\log n$, the limiting
distribution is 1-stable with characteristic function

(4) $t \mapsto \exp\{-|t|(\pi/2 - i\log|t|\,\mathrm{sgn}(t))\}$, $t \in \mathbb{R}$.

(e) If the relation (3) holds with $\alpha \in [0,1)$, then, for $b_n = 0$ and
$a_n := \log^\alpha n / L(\log n)$, the limiting distribution is the scaled
Mittag-Leffler law $\theta_\alpha$ (exponential, if $\alpha = 0$)
characterized by the moments

$\int_0^\infty x^n\,\theta_\alpha(dx) = \frac{n!}{\Gamma^n(1-\alpha)\,\Gamma(1+n\alpha)}$, $n \in \mathbb{N}$.

Asymptotic analysis of the number of empty boxes $K_{n,0}$ involves

$\nu := E(-\log\xi)$.

Our next result determines explicitly the limiting distribution of $K_{n,0}$.

Theorem 2.2. For $n \to \infty$, $K_{n,0}$ has the following asymptotic
properties:

(a) If $\nu < \infty$, then $K_{n,0}$ converges in distribution to a random
variable $K_{\infty,0}$. If also $\mu < \infty$, then

$P\{K_{\infty,0} \ge i\} = \frac{1}{\mu}\sum_{j=1}^\infty \frac{E\bar\xi^j}{j}\, P\{K_{j,0} = i-1\}$, $i \in \mathbb{N}$,

and $E K_{\infty,0} = \nu/\mu$, but if $\mu = \infty$, then $K_{\infty,0} = 0$ a.s.
(b) Assume that for some $\delta > 0$ both

(5) $E\xi^{-\delta} < \infty$ and $E\bar\xi^{-\delta} < \infty$,

then

$\lim_{n\to\infty} E K_{n,0} = \nu/\mu \in (0, \infty)$.

On the other hand, if $\nu = \infty$ and $\mu < \infty$, then
$\lim_{n\to\infty} E K_{n,0} = \infty$.

If $\mu < \infty$, the limiting variable $K_{\infty,0}$ has an interpretation
in terms of a model with infinitely many "balls" and "boxes" [13].
Specifically, one can take gaps between consecutive points in a stationary
renewal process on $\mathbb{R}$ in the role of "boxes," and points of an
independent Poisson process with the intensity measure $e^{-x}\,dx$
($x \in \mathbb{R}$) in the role of "balls." Other functionals of $C_n^*$,
like $K_{n,r}$, the number of parts equal to $r$, also have limiting forms
realizable in the infinite model.

By virtue of $K_n = K_n^* - K_{n,0}$ and because Theorem 2.1 implies that
$a_n \to \infty$, one can conclude that boundedness of $K_{n,0}$ ($K_{n,1}$)
in probability would lead to the following fact: if $(K_n^* - b_n)/a_n$ weakly
converges to some proper probability law, then $(K_n - b_n)/a_n$
($(K_n - K_{n,1} - b_n)/a_n$) weakly converges to the same law. According to
Theorem 2.2 (Proposition 5.2), the condition $\nu < \infty$ ensures an even
more delicate property: $K_{n,0}$ ($K_{n,0} + K_{n,1}$) converges in
distribution. A similar argument applies to $W_n$ and leads to the next
result.



Theorem 2.3. If $\nu < \infty$, then all the assertions of Theorem 2.1 remain
valid with $K_n^*$ replaced by $K_n$, $K_n - K_{n,1}$ or $W_n$.

Under the condition $\sigma^2 < \infty$, the normal limit for $K_n$ was
established in [10], Proposition 10, by a method which required asymptotic
expansion of moments. A generalization for a larger class of random
compositions appeared in [11], Proposition 8.

Theorem 2.4. If $\mu < \infty$, then $Z_n \overset{d}{\to} Z$ as
$n \to \infty$, where $Z$ has distribution

$P\{Z = k\} = \frac{E\xi^k}{\mu k}$, $k = 1, 2, \ldots$.

If (3) holds with some $\alpha \in [0, 1)$, then for $\alpha \in (0, 1)$

$\frac{\log Z_n}{\log n} \overset{d}{\to} \mathrm{beta}(1 - \alpha, \alpha)$,

and the degenerate limit distribution $\delta_1$ appears for $\alpha = 0$.
If (3) holds with $\alpha = 1$ and if $\mu = \infty$, then with
$m(x) := \int_0^x P\{-\log\bar\xi > y\}\,dy$ we have the convergence

$\frac{m(\log Z_n)}{m(\log n)} \overset{d}{\to}$ uniform [0, 1].
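
For a concrete feel of the first limit law in Theorem 2.4, the sketch below evaluates $P\{Z = k\} = E\xi^k/(\mu k)$ exactly when $\xi$ has a beta$(a, b)$ law with positive integer parameters. This parametrization, the function names and the use of the identity $\Psi(a+b) - \Psi(b) = \sum_{i=0}^{a-1} 1/(b+i)$ for integer $a$ are assumptions of the example, not part of the theorem.

```python
from fractions import Fraction


def xi_moment(a, b, k):
    """E[xi^k] for xi ~ beta(a, b) with positive integer a, b."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(a + i, a + b + i)
    return value


def mu_beta(a, b):
    """mu = E[-log(1 - xi)] for xi ~ beta(a, b) with positive integer a,
    using psi(a + b) - psi(b) = sum_{i=0}^{a-1} 1/(b + i)."""
    return sum(Fraction(1, b + i) for i in range(a))


def z_limit_pmf(a, b, k):
    """Limiting probability P{Z = k} = E[xi^k] / (mu * k)."""
    return xi_moment(a, b, k) / (mu_beta(a, b) * k)


# Uniform xi (a = b = 1): P{Z = k} = 1 / (k (k + 1)).
print([z_limit_pmf(1, 1, k) for k in range(1, 5)])
```

In the uniform case the probabilities telescope, so the partial sums $1 - 1/(k+1)$ make it easy to see that the law is proper.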



In the examples to follow $X_n$ stands for any of the variables $K_n^*$,
$K_n$, $K_n - K_{n,1}$ or $W_n$.

Example 2.5. Assume $\bar\xi$ has a beta$(c, b)$ density

$P\{\bar\xi \in dx\} = \frac{x^{c-1}(1-x)^{b-1}}{B(c, b)}\,dx$, $x \in [0,1]$,

with some $c, b > 0$ and $B(\cdot, \cdot)$ denoting the beta function [hence,
the law of $\xi_1$ is beta$(b, c)$]. In this case the moments are finite and
given by

$\mu = \Psi(c + b) - \Psi(c)$,

$\sigma^2 = \Psi'(c) - \Psi'(c + b)$,

where $\Psi(x) = \Gamma'(x)/\Gamma(x)$ denotes the logarithmic derivative of
the gamma function. Therefore, as $n \to \infty$,

$\frac{X_n - \mu^{-1}\log n}{(\mu^{-3}\sigma^2\log n)^{1/2}} \overset{d}{\to}$ normal(0, 1).

Above that, $Z_n \overset{d}{\to} Z$ with $Z$ having distribution

$P\{Z = k\} = \frac{\Gamma(c + b)\,\Gamma(k + b)}{\Gamma(b)\,\Gamma(k + b + c)\,k\,(\Psi(c + b) - \Psi(c))}$, $k \in \mathbb{N}$.

The number of empty boxes $K_{n,0}$ converges in distribution and in the
mean to a random variable $K_{\infty,0}$ with some nondegenerate
distribution. For $b \ne 1$ an explicit form of the limiting distribution is
still a challenge. For $b = 1$ Proposition 5.1 gives the generating function

$E s^{K_{\infty,0}} = \frac{\Gamma(1 + c)\,\Gamma(1 + c - cs)}{\Gamma(1 + 2c - cs)}$.

In particular, for integer $c$ the distribution of $K_{\infty,0}$ is the
convolution of $c$ geometric distributions with parameters $k/(k + c)$,
$k = 1, 2, \ldots, c$.
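
The last statement of Example 2.5 can be checked numerically. The sketch below compares the gamma-function form of the generating function with the product of generating functions of geometric laws on {0, 1, ...} with success probabilities $k/(k+c)$; the function names are hypothetical, and reading "geometric" as counting failures before the first success is an assumption of the sketch.

```python
import math


def pgf_gamma(c, s):
    """Gamma-function form of E s^{K_{infinity,0}} for b = 1."""
    return (math.gamma(1 + c) * math.gamma(1 + c - c * s)
            / math.gamma(1 + 2 * c - c * s))


def pgf_geometric_product(c, s):
    """Product over k = 1..c of the pgf p / (1 - (1 - p) s) of a
    geometric law on {0, 1, ...} with success probability p = k / (k + c)."""
    value = 1.0
    for k in range(1, c + 1):
        p = k / (k + c)
        value *= p / (1 - (1 - p) * s)
    return value


def limiting_mean(c):
    """Mean of the convolution: sum of (1 - p) / p = c / k over k."""
    return sum(c / k for k in range(1, c + 1))


print(pgf_gamma(3, 0.5), pgf_geometric_product(3, 0.5), limiting_mean(3))
```

For $b = 1$ one has $\mu = 1/c$ and $\nu = 1 + 1/2 + \cdots + 1/c$, so the mean returned by `limiting_mean` agrees with the value $\nu/\mu$ from Theorem 2.2.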

Example 2.6. Suppose $\bar\xi$ has distribution function

$P\{\bar\xi \le x\} = \frac{1}{1 - \log x}$, $x \in (0, 1)$.

Then the condition (3) holds with $\alpha = 1$ and $\mu = \infty$. Since

$P\{-\log\xi > x\} = \frac{-\log(1 - e^{-x})}{1 - \log(1 - e^{-x})}$, $x > 0$,

for $x \to \infty$ we have $P\{-\log\xi > x\} \sim e^{-x}$, whence
$\nu < \infty$. Therefore, as $n \to \infty$,

(6) $\frac{(\log\log n)^2}{\log n}\,X_n - \log\log n - \log\log\log n$

converges in distribution to the spectrally negative 1-stable law with
characteristic function (4). Since $P\{-\log\bar\xi > x\} = (x + 1)^{-1}$
holds for $x \ge 0$, the normalizing constants in (6) can be calculated in the
same way as in [20], Proposition 2. Above that,

$\frac{\log\log Z_n}{\log\log n} \overset{d}{\to}$ uniform[0, 1] and $K_{n,0} \overset{d}{\to} \delta_0$.

Example 2.7. For $\bar\xi$ with distribution

$P\{\bar\xi \le x\} = \frac{-\log(1 - x)}{1 - \log(1 - x)}$, $x \in (0, 1)$,

we have $\sigma^2 < \infty$ but $\nu = \infty$; hence, Theorem 2.3 is not
applicable.

3. Compositions, Markov chains and recursions. Weak composition $C_n^*$
can be identified with a path of a time-homogeneous nonincreasing Markov
chain $Q_n^*$ on integers which starts at $n$, terminates at 0 and has
nonnegative integer decrements equal to the parts of $C_n^*$. Similarly,
composition $C_n$ can be identified with the path of a Markov chain $Q_n$,
whose decrements are positive until absorption at state 0. For fixed $n$, in
terms of "balls-in-boxes," $Q_n^*(k)$ is the number of exponential points
which fall outside the first $k$ spacings induced by $(S_i)$, and $Q_n(k)$ is
the number of exponential points which fall outside the first $k$ spacings
containing at least one of the $E_j$, for $k = 0, 1, \ldots$.

Following terminology from [14], the transition probabilities are
determined by the decrement matrices

(7) $q^*(n:m) := \binom{n}{m} E[\xi^m \bar\xi^{n-m}]$, $m = 0, \ldots, n$,

(8) $q(n:m) := \binom{n}{m} \frac{E[\xi^m \bar\xi^{n-m}]}{1 - E\bar\xi^n}$, $m = 1, \ldots, n$,

which specify the probability distribution of the first part of $C_n^*$,
respectively, $C_n$. By this representation, $q^*(n:m)$ and $q(n:m)$ are the
transition probabilities from $n > 0$ to $n - m$ for the Markov chains
$Q_n^*$ and $Q_n$, respectively.
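
The decrement matrices can be evaluated exactly in simple cases. The following sketch assumes that $\xi$ has a beta$(a, b)$ law with positive integer parameters, so that $E[\xi^m\bar\xi^{k}] = B(a+m, b+k)/B(a, b)$ reduces to factorials; the function names and this choice of law are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb, factorial


def beta_fn(p, q):
    """Beta function B(p, q) for positive integers p and q."""
    return Fraction(factorial(p - 1) * factorial(q - 1), factorial(p + q - 1))


def mixed_moment(a, b, m, k):
    """E[xi^m (1 - xi)^k] for xi ~ beta(a, b) with integer a, b."""
    return beta_fn(a + m, b + k) / beta_fn(a, b)


def q_star(n, m, a=1, b=1):
    """Probability that the first box receives m of n balls, m = 0..n."""
    return comb(n, m) * mixed_moment(a, b, m, n - m)


def q(n, m, a=1, b=1):
    """The same, conditioned on the first box being nonempty, m = 1..n."""
    return q_star(n, m, a, b) / (1 - mixed_moment(a, b, 0, n))


# Uniform xi: the first part of the weak composition is uniform on {0..n}.
print([q_star(5, m) for m in range(6)], [q(5, m) for m in range(1, 6)])
```

With uniform $\xi$ the first row is constant, $1/(n+1)$, and after conditioning on a nonempty first box it becomes $1/n$, which gives a quick sanity check of (7) and (8).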
Introduce the total frequency of boxes whose indices are larger than $j$,

$\zeta_j := \bar\xi_1 \cdots \bar\xi_j = 1 - (P_1 + \cdots + P_j)$.

From the construction of $C_n^*$ it is clear that

(9) $P\{Q_n^*(j) = n - m\} = \binom{n}{m} E[\zeta_j^{\,n-m}(1 - \zeta_j)^m]$,

which is the multistep generalization of (7). Also,

(10) $P\{K_n^* > k\} = P\{Q_n^*(k) > 0\} = E[1 - (1 - \zeta_k)^n]$.

The variables we are interested in have an obvious interpretation via the
Markov chains. Thus, the absorption time of $Q_n^*$, that is, the number of
steps the chain needs to approach 0, is $K_n^*$, and the absorption time of
$Q_n$ is $K_n$.
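
Formula (10) becomes fully explicit when the moments of the tail frequency are known. As an assumed special case, take $\xi$ uniform on (0, 1): then the $m$th moment of the total frequency of boxes with index larger than $k$ equals $(m+1)^{-k}$, and expanding the $n$th power in (10) gives an alternating sum. The function name below is hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_last_index_exceeds(n, k):
    """P{K_n^* > k} for uniform xi, from (10) by binomial expansion:
    sum over m = 1..n of (-1)^(m+1) C(n, m) (m + 1)^(-k)."""
    return sum(Fraction((-1) ** (m + 1) * comb(n, m), (m + 1) ** k)
               for m in range(1, n + 1))


print([prob_last_index_exceeds(10, k) for k in range(4)])
```

Two easy checks: for a single ball the answer is $2^{-k}$, and for two balls and $k = 1$ it is $1 - E\xi^2 = 2/3$, the probability that not both balls land in box 1.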

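The Markov property leads to distributional recursions

(11) $K_0^* := 0$, $K_n^* \overset{d}{=} K_{A_n^*}^* + 1$, $n \in \mathbb{N}$,

and

(12) $K_0 := 0$, $K_n \overset{d}{=} K_{A_n} + 1$, $n \in \mathbb{N}$,

where $A_n^*$ is assumed independent of $\{K_j^* : j \in \mathbb{N}\}$ and
$A_n^* \overset{d}{=} Q_n^*(1)$; $A_n$ is independent of
$\{K_j : j \in \mathbb{N}\}$ and $A_n \overset{d}{=} Q_n(1)$. So
$P\{A_n^* = n - m\} = q^*(n:m)$ and $P\{A_n = n - m\} = q(n:m)$.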

Now let $V_n$ be the number of balls that fall to the right from the first
empty box. For instance, for weak compositions (1,2,1,0,2,0,0,3,0,0,...),
(1,2,1,2,3,0,0,...) and (0,1,2,1,2,3,0,0,...), the value of $V_9$ is 5, 0
and 9, respectively. Then

(13) $K_{0,0} = 0$, $K_{n,0} \overset{d}{=} K_{V_n,0} + 1_{\{V_n > 0\}}$, $n \in \mathbb{N}$,

where on the right-hand side $V_n$ is independent of
$\{K_{n,0} : n \in \mathbb{N}\}$. Here and below, $1_{\{\cdots\}}$ is 1 if
$\cdots$ holds true and is 0 otherwise. Furthermore,

(14) $K_0^* = 0$, $K_n^* \overset{d}{=} K_{V_n}^* + W_n - 1_{\{V_n = 0\}}$, $n \in \mathbb{N}$,

where on the right-hand side $(V_n, W_n)$ are independent of
$\{K_n^* : n \in \mathbb{N}\}$. Finally, we remark that

(15) $K_{n,0} \overset{d}{=} K_{A_n^*,0} + 1_{\{A_n^* = n\}}$, $n \in \mathbb{N}$,

where $A_n^*$ is independent of $\{K_{n,0} : n \in \mathbb{N}\}$.
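
The quantities entering (13) and (14) are easy to read off a given weak composition. The sketch below, with a hypothetical function name, returns the index of the first empty box together with the number of balls to its right, and reproduces the three values 5, 0 and 9 from the example above.

```python
def first_empty_and_tail(parts):
    """Return (W_n, V_n): the index of the first empty box and the
    number of balls in boxes with a larger index."""
    padded = list(parts) + [0]       # the composition continues with zeros
    w = padded.index(0) + 1          # 1-based index of the first empty box
    v = sum(padded[w:])              # balls strictly to the right of box w
    return w, v


for comp in ([1, 2, 1, 0, 2, 0, 0, 3], [1, 2, 1, 2, 3], [0, 1, 2, 1, 2, 3]):
    print(comp, first_empty_and_tail(comp))
```

When no ball lies to the right of the first empty box, that box directly follows the last occupied one, which is the case in which the indicator in (14) equals 1.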
The chain $Q_n$ visits a given state $m$ with the same probability as
$Q_n^*$. We denote this probability by

$g(n, m) := P\{Q_n^*(0) = m\} + \sum_{j=1}^\infty P\{Q_n^*(j) = m,\ Q_n^*(j-1) \ne m\} = \sum_{j=0}^\infty P\{Q_n(j) = m\}$.

In principle, $g$ can be computed from (9), but there is a simpler and more
general formula which involves only $E[1 - \bar\xi^k]$, $k = m, \ldots, n$;
see [14], Theorem 9.2.

It was shown in [10], Proposition 5, that under the assumption $\mu < \infty$

(16) $\lim_{n\to\infty} g(n, m) = \frac{1 - E\bar\xi^m}{\mu m}$,

and the same argument^3 allows one to show that
$\lim_{n\to\infty} g(n, m) = 0$ if $\mu = \infty$.

4. Index of the last occupied box and a proof of Theorem 2.1. By results
of the renewal theory (which we summarize in Proposition A.1), it is enough
to show the equivalence

(17) $\frac{K_n^* - b_n}{a_n} \overset{d}{\to} X \iff \frac{N_{\log n} - b_n}{a_n} \overset{d}{\to} X$,

where $X$ is a random variable with a proper and nondegenerate probability
distribution and $N_t$ is as in (2). Assuming that the convergence in the
left-hand side of (17) holds with $a_n \to \infty$, we have for $y > 0$

$P\{(K_n^* - b_n)/a_n > x\} = P\{(N_{E_{n,n}} - b_n)/a_n > x\}$
$\le P\{(N_{\log n + y} - b_n)/a_n > x\}\, P\{E_{n,n} - \log n \le y\} + P\{E_{n,n} - \log n > y\}$.

By subadditivity of the number of renewals, $N_{\log n + y}$ does not exceed
stochastically the sum $N_{\log n} + N'_y$ with independent terms and
$N'_y \overset{d}{=} N_y$; hence, we can estimate the above further as

$\le P\{(N_{\log n} - b_n)/a_n + N'_y/a_n > x\}\, P\{E_{n,n} - \log n \le y\} + P\{E_{n,n} - \log n > y\}$.

^3 The proof on top of [10], page 86, must be corrected by changing $n - m$ to $n - m + 1$.



THE BERNOULLI SIEVE REVISITED 






By the selection principle, there exists an increasing subsequence $(n_k)$
such that the variable $(N_{\log n_k} - b_{n_k})/a_{n_k}$ converges weakly to
some measure $F'$, say. Recalling the convergence of $E_{n,n} - \log n$ and
sending $y$ to $\infty$, we have $F(x, \infty) \le F'(x, \infty)$ at all joint
continuity points of $F(x, \infty)$ and $F'(x, \infty)$. Similarly, for
$y < 0$,

$P\{(K_n^* - b_n)/a_n > x\} = P\{(N_{E_{n,n}} - b_n)/a_n > x\}$
$\ge P\{(N_{\log n} - b_n)/a_n - N'_{-y}/a_n > x\}\, P\{E_{n,n} - \log n > y\}$.

Letting again $n \to \infty$ along $(n_k)$ and then sending $y$ to $-\infty$,
we have $F(x, \infty) \ge F'(x, \infty)$ at all joint continuity points of $F$
and $F'$. Therefore, $F = F'$ and since the limit does not depend on the
subsequence, we conclude that $(N_{\log n} - b_n)/a_n \overset{d}{\to} X$.

Obviously the number of balls outside the first box, $A_n^*$, goes to
$\infty$ a.s., which together with an application of (11) implies that
$K_n^*$ cannot converge in distribution if no scaling or centering is
imposed. Nor can $K_n^* - b_n$, for any unbounded sequence $b_n > 0$. Indeed,
if the convergence were the case, from the convergence of $E_{n,n} - \log n$
and a.s. monotonicity of $N_t$ it would follow that $N_{\log n} - b_n$ were
bounded in probability, which is known to be false. Following the same line
of argument, one can prove that $(K_n^* - b_n)/a_n$ also cannot converge in
distribution if $a_n$ is either bounded or unbounded but does not go to
$\infty$.

To establish the result in the reverse direction, we prefer to exploit the
multiplicative form of the renewal process. For each $\varepsilon > 0$ define

$M_n^{(\varepsilon)} := \inf\{k \ge 1 : n\,\bar\xi_1 \cdots \bar\xi_k < \varepsilon\}$, $n \in \mathbb{N}$,

and notice that $M_n^{(1)} = N_{\log n}$. Assume that
$(M_n^{(1)} - b_n)/a_n \overset{d}{\to} X$, where $X$ is a random variable
with a proper and nondegenerate distribution $F$. By Proposition A.1 from the
Appendix, $F$ is continuous and $a_n$ slowly varies. Also, Proposition A.1
provides an explicit form of $b_n$. Using this, we conclude that
$(M_n^{(\varepsilon)} - b_n)/a_n \overset{d}{\to} X$, no matter what
$\varepsilon$ is.

For fixed $x \in \mathbb{R}$ and $n$ sufficiently large put
$k_n := \lfloor a_n x + b_n \rfloor$. Write
$\zeta_{k_n} = \bar\xi_1 \cdots \bar\xi_{k_n}$. Since for large $n$

$E[1 - (1 - \zeta_{k_n})^n] \ge E[(1 - (1 - \zeta_{k_n})^n)\,1(\zeta_{k_n} \ge \varepsilon/n)]$
$\ge (1 - (1 - \varepsilon/n)^n)\, P\{M_n^{(\varepsilon)} > k_n\}$,

letting in (10) first $n \to \infty$ and then $\varepsilon \to \infty$, we
obtain $\liminf_{n\to\infty} P\{K_n^* > k_n\} \ge F(x, \infty)$. On the other
hand, for large $n$,

$E[1 - (1 - \zeta_{k_n})^n] \le (1 - (1 - \varepsilon/n)^n)\, P\{M_n^{(\varepsilon)} \le k_n\} + P\{M_n^{(\varepsilon)} > k_n\}$.

Sending in (10) first $n \to \infty$ and then $\varepsilon \to 0$, we obtain
$\limsup_{n\to\infty} P\{K_n^* > k_n\} \le F(x, \infty)$. Combining the lower
and upper limit, we conclude that $(K_n^* - b_n)/a_n \overset{d}{\to} X$, as
desired.

5. The number of empty boxes and a proof of Theorem 2.2. In the
setting of GEM distribution, that is, when
$\xi \overset{d}{=}$ beta$(1, \theta)$, the distribution of the number
$K_{n,r}$ of boxes occupied by exactly $r$ balls is asymptotically
Poisson$(\theta/r)$, for every $r > 0$. See, for example, [1], Theorem 4.17,
where the fact appears in connection with the cycle structure of random
$\theta$-biased permutations. Quite unexpectedly, the limit law of $K_{n,0}$
is not Poisson. In the spirit of discussion after Theorem 2.2, the limit
variable may be interpreted in terms of the Poisson process $\Pi_1$ (boxes)
of intensity $(\theta/x)\,dx$, $x \in \mathbb{R}_+$, and another independent
rate-1 Poisson process $\Pi_2$ (balls) on $\mathbb{R}_+$: $K_{\infty,0}$ is
the number of gaps in $\Pi_1$ that are to the right of the leftmost atom of
$\Pi_2$ and do not contain points of $\Pi_2$.

Proposition 5.1. If $\xi$ has beta$(1, \theta)$ distribution ($\theta > 0$),
then $K_{n,0}$ converges in distribution to a variable $K_{\infty,0}$ with

(18) $E s^{K_{\infty,0}} = \frac{\Gamma(1 + \theta)\,\Gamma(1 + \theta - \theta s)}{\Gamma(1 + 2\theta - \theta s)}$,

which is the generating function of a mixed Poisson distribution with random
parameter $\theta\,|\log \xi|$.

Proof. For $j = 1, 2, \ldots$ let
$M_j \overset{d}{=}$ geometric$(j/(\theta + j))$ be independent random
variables. A key fact is the representation

(19) $K_{n,0} \overset{d}{=} (M_1 - 1)^+ + \cdots + (M_{n-1} - 1)^+ + M_n$.

To prove this, one needs to set
$M_j := \#\{k : S_k \in (E_{n-j,n}, E_{n-j+1,n})\}$, $j = 1, \ldots, n$ (with
the convention $E_{0,n} = 0$), which is the number of points of a
rate-$\theta$ Poisson process which fall between consecutive order
statistics. The assertion about the joint distribution of the $M_j$'s follows
from the independence property of the Poisson process and the observation
that the differences $E_{n,n} - E_{n-1,n}$, $E_{n-1,n} - E_{n-2,n}$, ...,
$E_{1,n} - E_{0,n}$ are independent exponential variables with rates
$1, 2, \ldots, n$. Now, counting the number of empty gaps $(S_k, S_{k+1})$
which fit in $(E_{n-j,n}, E_{n-j+1,n})$, we see that this is $M_n$ for
$j = n$, and $(M_j - 1)^+$ for $j = 1, \ldots, n - 1$.

Passing to generating functions, (19) becomes

$E s^{K_{n,0}} = \frac{n}{n + \theta - \theta s} \prod_{j=1}^{n-1} \frac{j\,(j + 2\theta - \theta s)}{(j + \theta)(j + \theta - \theta s)}$,

and (18) follows by sending $n \to \infty$ and evaluating the infinite product
in terms of the gamma function [25]. The generating function of the stated
mixed Poisson distribution is calculated by recalling that the generating
function of Poisson$(u)$ is $e^{-u(1-s)}$ and that the Mellin transform of
beta$(1, \theta)$ is $E\xi^v = \theta B(\theta, 1 + v)$, whence

$E[\exp(\theta(1 - s)\log\xi)] = \theta B(\theta, 1 + \theta - \theta s)$,

which is the same as (18). □
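
The passage from the finite product to the gamma-function form can be checked numerically. The sketch below evaluates the generating function of the sum in (19) for geometric variables on {0, 1, ...} with success probabilities $j/(\theta + j)$ and compares it with the gamma ratio of Proposition 5.1; the function names and the particular values of $\theta$ and $s$ are assumptions chosen for illustration.

```python
import math


def pgf_finite(n, theta, s):
    """Generating function of (M_1 - 1)^+ + ... + (M_{n-1} - 1)^+ + M_n
    for independent geometric M_j with success probability j / (theta + j)."""
    value = n / (n + theta - theta * s)            # factor from M_n
    for j in range(1, n):                          # factors from (M_j - 1)^+
        value *= (j * (j + 2 * theta - theta * s)
                  / ((j + theta) * (j + theta - theta * s)))
    return value


def pgf_limit(theta, s):
    """Gamma-function form of the limiting generating function."""
    return (math.gamma(1 + theta) * math.gamma(1 + theta - theta * s)
            / math.gamma(1 + 2 * theta - theta * s))


print(pgf_finite(20000, 1.5, 0.4), pgf_limit(1.5, 0.4))
```

The two numbers differ by a relative error of order $1/n$, so a moderately large $n$ already shows the agreement.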

The proof of Theorem 2.2 will exploit the poissonization technique, a well- 
known approach that goes back at least to Kac [22] (see also [12, 23] for the 
application of this technique to the balls-in-boxes scheme). 

We shall first consider a sampling scheme in which exponential points
$E_1, E_2, \ldots$ are thrown at the epochs of an independent Poisson process
$\{\Pi(t) : t \ge 0\}$ with intensity one. After establishing convergence of
$K_{\Pi(t),0}$, we shall turn to that of $K_{n,0}$.

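Proof of Theorem 2.2. (a) Convergence in the Poisson model. For
$n, i \in \mathbb{N}_0$ and $t \ge 0$ set $a_n^{(i)} := P\{K_{n,0} = i\}$,

$f^{(i)}(t) := \sum_{k=1}^\infty a_k^{(i)} \frac{t^k}{k!}$ and $g^{(i)}(t) := e^{-t} f^{(i)}(t)$.

Notice that

$g^{(0)}(t) + e^{-t} = P\{K_{\Pi(t),0} = 0\}$, $g^{(i)}(t) = P\{K_{\Pi(t),0} = i\}$, $i \in \mathbb{N}$.

The equality of distributions (15) is equivalent to the following equalities:

$a_0^{(0)} = 1$, $a_n^{(0)} = \sum_{k=0}^{n-1} a_k^{(0)}\, P\{A_n^* = k\}$, $n \in \mathbb{N}$;

$a_0^{(i)} = 0$, $a_n^{(i)} = a_n^{(i-1)}\, P\{A_n^* = n\} + \sum_{k=0}^{n-1} a_k^{(i)}\, P\{A_n^* = k\}$, $i, n \in \mathbb{N}$,

from which we deduce after some calculations

$g^{(0)}(t) = E[g^{(0)}(t\bar\xi)] + E[e^{-t\bar\xi}] - e^{-t} - e^{-t}E[f^{(0)}(t\bar\xi)] =: E[g^{(0)}(t\bar\xi)] + f(t)$;

$g^{(i)}(t) = E[g^{(i)}(t\bar\xi)] + e^{-t}\big(E[f^{(i-1)}(t\bar\xi)] - E[f^{(i)}(t\bar\xi)]\big)$, $i \in \mathbb{N}$.

Fix any $t_0 \in \mathbb{R}$ and define

$f_1(t) := 1(t > t_0)\big(E[\exp(-e^t\bar\xi)] - \exp(-e^t)\big)$,
$f_2(t) := 1(t \le t_0)\big(E[\exp(-e^t\bar\xi)] - \exp(-e^t)\big)$,
$f_3(t) := 1(t > t_0)\exp(-e^t)\,E[f^{(0)}(e^t\bar\xi)]$,
$f_4(t) := 1(t \le t_0)\exp(-e^t)\,E[f^{(0)}(e^t\bar\xi)]$.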



A. V. GNEDIN, A. M. IKSANOV, P. NEGADAJLOV AND U. RÖSLER
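Since $g^{(0)}$ is bounded and $g^{(0)}(0) = 0$,

$g^{(0)}(e^t) = \int_0^\infty f(e^{t-u})\, d\Big(\sum_{n=0}^\infty P\{S_n \le u\}\Big)$.

If it were shown that $f_j$, $j = 1, 2, 3, 4$, were directly Riemann
integrable (dRi) on $\mathbb{R}$, then since
$f(e^t) = f_1(t) + f_2(t) - f_3(t) - f_4(t)$, we could apply the key renewal
theorem to conclude that

(20) $\lim_{t\to\infty} P\{K_{\Pi(t),0} = 0\} = \lim_{t\to\infty} g^{(0)}(e^t) = \frac{1}{\mu}\int_0^\infty \frac{f(t)}{t}\,dt = 1 - \frac{1}{\mu}\sum_{j=1}^\infty \frac{E\bar\xi^j}{j}\, P\{K_{j,0} = 0\}$.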

We will only prove that $f_3$ and $f_4$ are dRi, the analysis of $f_1$ and
$f_2$ being similar. Since $f_3$ and $f_4$ are continuous and positive on the
sets $\{t > t_0\}$ and $\{t \le t_0\}$, respectively, it suffices to find dRi
majorants. We have

$f_3(t) \le 1(t > t_0)\big(E[\exp(-e^t(1 - \bar\xi))] - \exp(-e^t)\big) \le 1(t > t_0)\,E[\exp(-e^t(1 - \bar\xi))] =: f_5(t)$,

$f_4(t) \le 1(t \le t_0)\big(E[\exp(-e^t(1 - \bar\xi))] - \exp(-e^t)\big) \le 1(t \le t_0)\big(1 - \exp(-e^t)\big) =: f_6(t)$.

The functions $f_5$ and $f_6$ are dRi, since they are bounded, monotone on
the sets $\{t > t_0\}$ and $\{t \le t_0\}$, respectively, and integrable.
Integrability of $f_5$ follows from the condition $\nu < \infty$. This
completes the proof of (20).

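Arguing in the same manner as for the case $i = 0$, we conclude that for
$i \in \mathbb{N}$

$\lim_{t\to\infty} P\{K_{\Pi(t),0} = i\} = \lim_{t\to\infty} g^{(i)}(e^t) = \frac{1}{\mu}\int_0^\infty \frac{e^{-t}\big(E[f^{(i-1)}(t\bar\xi)] - E[f^{(i)}(t\bar\xi)]\big)}{t}\,dt$

$= \frac{1}{\mu}\sum_{j=1}^\infty \frac{E\bar\xi^j}{j}\big(P\{K_{j,0} = i - 1\} - P\{K_{j,0} = i\}\big)$.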



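Assume now that $\mu = \infty$ and $\nu < \infty$. It suffices to prove that,
as $t \to \infty$,
$\hat g^{(0)}(t) := e^{-t}\sum_{k=0}^\infty a_k^{(0)} t^k/k! \to 1$. Since
$\hat g^{(0)}(0) = 1$, $\hat g^{(0)}$ is bounded and satisfies

$\hat g^{(0)}(t) = E[\hat g^{(0)}(t\bar\xi)] - e^{-t}E[f^{(0)}(t\bar\xi)]$,

we conclude that

$\hat g^{(0)}(e^t) = 1 - \int_0^\infty \exp(-e^{t-u})\,E[f^{(0)}(e^{t-u}\bar\xi)]\, d\Big(\sum_{n=0}^\infty P\{S_n \le u\}\Big)$.

In the same way as in the first part of the proof we check that the key
renewal theorem applies to yield

$\lim_{t\to\infty} \hat g^{(0)}(e^t) = 1 - \frac{1}{\mu}\int_0^\infty \frac{e^{-u}E[f^{(0)}(u\bar\xi)]}{u}\,du = 1$

(the last integral converges in view of the condition $\nu < \infty$). Thus,
we have already proved that, as $t \to \infty$,
$K_{\Pi(t),0} \overset{d}{\to} K_{\infty,0}$. Notice that, if
$\mu < \infty$, then

$E K_{\infty,0} = \sum_{i=1}^\infty P\{K_{\infty,0} \ge i\} = \frac{1}{\mu}\sum_{i=1}^\infty \sum_{j=1}^\infty \frac{E\bar\xi^j}{j}\, P\{K_{j,0} = i - 1\} = \frac{1}{\mu}\sum_{j=1}^\infty \frac{E\bar\xi^j}{j} = \frac{\nu}{\mu}$.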

(b) Depoissonization. For any fixed $\varepsilon \in (0, 1)$ and $x > 0$,
write $n_1 := \lfloor (1 - \varepsilon)t \rfloor$ and
$n_2 := \lfloor (1 + \varepsilon)t \rfloor$. We have

$P\{K_{\Pi(t),0} > x\}$
$\le P\{K_{\Pi(t),0} > x,\ n_1 \le \Pi(t) \le n_2\} + P\{|\Pi(t) - t| > \varepsilon t\}$
$\le P\{\max_{n_1 \le i \le n_2} K_{i,0} > x\} + P\{|\Pi(t) - t| > \varepsilon t\}$
$= P\{K_{n_1,0} > x\} + P\{K_{n_1,0} \le x,\ \max_{n_1 \le i \le n_2} K_{i,0} > x\} + P\{|\Pi(t) - t| > \varepsilon t\}$
$=: I_1(t) + I_2(t) + I_3(t)$.

Similarly,

(21) $P\{K_{\Pi(t),0} \le x\} \le P\{K_{n_2,0} \le x\} + P\{K_{n_2,0} > x,\ \min_{n_1 \le i \le n_2} K_{i,0} \le x\} + P\{|\Pi(t) - t| > \varepsilon t\} =: J_1(t) + J_2(t) + J_3(t)$.

If exponential points $E_{n_1 + 1}, \ldots, E_{n_2}$ fall to the left from the
point $E_{n_1, n_1}$, then

$\max_{n_1 \le i \le n_2} K_{i,0} \le K_{n_1,0}$

and also

$K_{n_2,0} \le \min_{n_1 \le i \le n_2} K_{i,0}$,

which means that neither the event defining $I_2(t)$, nor $J_2(t)$, can hold.
Therefore,

$\max(I_2(t), J_2(t)) \le P\{\max_{n_1 + 1 \le i \le n_2} E_i > E_{n_1, n_1}\}$
$= E\big(1 - (1 - e^{-E_{n_1, n_1}})^{n_2 - n_1}\big)$
$= 1 - \frac{\lfloor (1 - \varepsilon)t \rfloor}{\lfloor (1 + \varepsilon)t \rfloor}$.

By a large deviation result (see, e.g., [2]), there exist positive constants
$\delta_1$ and $\delta_2$ such that, for all $t > 0$,

$I_3(t) \le \delta_1 e^{-\delta_2 t}$.

Select now $t$ such that $(1 - \varepsilon)t = n \in \mathbb{N}$. Then from
the calculations above we get

$P\{K_{\Pi(n/(1-\varepsilon)),0} > x\} \le P\{K_{n,0} > x\} + 1 - \frac{n}{\lfloor (1 + \varepsilon)n/(1 - \varepsilon) \rfloor} + \delta_1 \exp(-\delta_2 n/(1 - \varepsilon))$.

Sending first $n \uparrow \infty$ and then $\varepsilon \downarrow 0$, we
obtain

$\liminf_{n\to\infty} P\{K_{n,0} > x\} \ge P\{K_{\infty,0} > x\}$

at all continuity points $x$ of the right-hand side. The same argument applied
to (21) establishes the converse inequality for the upper limit.

(c) Convergence of the mean value. Denote by
$H(x) := \sum_{k=0}^\infty P\{S_k \le x\}$ the renewal function and notice
that

(22) $\int_0^\infty e^{-sx}\,dH(x) = \frac{1}{1 - E\bar\xi^s}$, $s > 0$.

We have

$E K_{n,0} = E \sum_{k=0}^\infty \big((1 - e^{-S_k} + e^{-S_{k+1}})^n - (1 - e^{-S_k})^n\big)$
$= \int_0^\infty \big(E(1 - e^{-x}\xi)^n - (1 - e^{-x})^n\big)\,dH(x)$
$= \int_0^\infty \sum_{k=1}^n (-1)^{k+1}\binom{n}{k} e^{-kx}(1 - E\xi^k)\,dH(x)$
$= \sum_{k=1}^n (-1)^{k+1}\binom{n}{k}\frac{1 - E\xi^k}{1 - E\bar\xi^k}$.

The conditions (5) imply that $\mu < \infty$ and $\nu < \infty$. The relation

(23) $\lim_{n\to\infty} E K_{n,0} = \nu/\mu$

follows by an application of [9], Theorem 2(ii), to the formula for
$E K_{n,0}$. The cited result relies upon complex analysis and requires a
sufficiently large domain of definition of the Mellin transforms of $\xi$ and
$\bar\xi$, which is secured by our assumption (5). We perceive that (23) holds
whenever $\nu < \infty$, but have no proof of this conjecture so far.

For $n \in \mathbb{N}_0$ set $x_n := E K_{n,0}$,
$r_n := E\bar\xi^n/(1 - E\bar\xi^n)$. With decrement matrix as in (8), we
have, according to (15),

$x_0 = 0$, $x_n = \sum_{m=1}^n q(n:m)\,x_{n-m} + r_n$, $n \in \mathbb{N}$,

which is of the same form as [10], (11). Then $x_n$ is given by

$x_n = \sum_{m=1}^{n-1} g(n, m)\,r_m + r_n$, $n \in \mathbb{N}$

(compare to [10], (12)), where $g(n, m)$ was defined in Section 3. Assuming
that $\nu = \infty$ and $\mu < \infty$ and using (16) along with Fatou's lemma
gives

$\liminf_{n\to\infty} x_n \ge \sum_{m=1}^\infty \frac{1 - E\bar\xi^m}{\mu m}\cdot\frac{E\bar\xi^m}{1 - E\bar\xi^m} = \frac{1}{\mu}\sum_{m=1}^\infty \frac{E\bar\xi^m}{m} = \infty$,

where the last series diverges in view of the condition $\nu = \infty$. □
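
The closed form for the mean number of empty boxes, $E K_{n,0} = \sum_{k=1}^n (-1)^{k+1}\binom{n}{k}(1 - E\xi^k)/(1 - E\bar\xi^k)$, can be evaluated exactly for small $n$. The sketch below assumes $\xi \sim$ beta$(a, b)$ with positive integer parameters (so that $\bar\xi \sim$ beta$(b, a)$) and uses rational arithmetic, because the alternating sum is numerically unstable in floating point; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def beta_moment(a, b, k):
    """E[X^k] for X ~ beta(a, b) with positive integer a, b."""
    value = Fraction(1)
    for i in range(k):
        value *= Fraction(a + i, a + b + i)
    return value


def mean_empty_boxes(n, a, b):
    """E K_{n,0} for xi ~ beta(a, b), via the alternating sum
    sum_k (-1)^(k+1) C(n, k) (1 - E xi^k) / (1 - E (1 - xi)^k)."""
    total = Fraction(0)
    for k in range(1, n + 1):
        ratio = (1 - beta_moment(a, b, k)) / (1 - beta_moment(b, a, k))
        total += (-1) ** (k + 1) * comb(n, k) * ratio
    return total


print([mean_empty_boxes(n, 1, 1) for n in range(1, 6)])
print([mean_empty_boxes(n, 1, 2) for n in range(1, 6)])
```

For uniform $\xi$ the ratio in the sum is identically 1, so the mean equals 1 for every $n$, in agreement with $\nu/\mu = 1$. For $\xi \sim$ beta(1, 2) the first two values are 2 and 7/3, which matches the mean obtained from the representation (19) with $\theta = 2$.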

Exactly the same argument as above can be exploited for proving that
$K_{n,0} + \cdots + K_{n,r}$ converges in distribution. However, for
$r \ge 2$ calculations get complicated and in Proposition 5.2 we content
ourselves with the case $r = 1$.

Proposition 5.2. If $\nu < \infty$, then, as $n \to \infty$,
$K_{n,0} + K_{n,1}$ converges in distribution to a random variable $K_{01}$.
If also $\mu < \infty$, then
1 / ^ /'Mdf'^ \ \ 

n^oi >'^} = -(^^+J2 [-J- + ^^'^) ^{^^jfi + ^i.i = 0} j ' 

F{i^oi >i} = - ( Ef -2]E^(1 - E^^) 



+ E — + ^^'^ ) ^{^i.o + Kj,i=i-1}], 



j=2 ^ 

i = 2, 3, . . . , 

and $E K_{01} = (\nu + 1)/\mu$, but if $\mu = \infty$, then $K_{01} = 0$ a.s.

Sketch of the proof. For $n, i \in \mathbb{N}_0$ and $t \ge 0$ set
$Y_n := K_{n,0} + K_{n,1}$,

$f^{(i)}(t) := \sum_{k=2}^\infty \frac{t^k}{k!}\, P\{Y_k = i\}$ and $g^{(i)}(t) := e^{-t} f^{(i)}(t)$.

Use the recursion

(24) $Y_0 = 0$, $Y_n \overset{d}{=} Y_{A_n^*} + 1(A_n^* \ge n - 1)$, $n \in \mathbb{N}$,

where $A_n^*$ is independent of $\{Y_k : k \in \mathbb{N}\}$, to obtain

^(O) (t) = Eff(o) {to + IEe~*«' - e"* - e-*E/(°)(iO 

(25) 

-te-*Ee(/W(tO + l); 

(26) + E(/(^^) (to + t^(EO*"^EO (e"*«' - e"* - te'^O • 

In the same way as in the proof of Theorem 2.2, we can justify using the key
renewal theorem in (25) and (26) to get a poissonized version of the result.
Our depoissonization argument used in the proof of Theorem 2.2 works
without changes. Justification of the only step that may require explanation
is as follows: the inequality $K_{n,0} + K_{n,1} < K_{m,0} + K_{m,1}$,
$n < m$, implies that at least one of the exponential points
$E_{n+1}, \ldots, E_m$ falls to the right from $E_{n,n}$. □

6. Proof of Theorem 2.3. Assume that $\nu < \infty$ and that $(K_n^* - b_n)/a_n$
converges in distribution to a random variable $X$ with some proper and
nondegenerate probability law. According to Theorem 2.1, the latter can
occur if one of five conditions (1)-(5) holds and also $a_n \to \infty$. Notice that
for each of these conditions there exist distributions that satisfy it together
with the condition $\nu < \infty$ [as, e.g., in Examples 2.5 and 2.6]. By Theorem
2.2(a) and the Markov inequality, $K_{n,0}/a_n = (K_n^* - K_n)/a_n$ goes to $0$ in
probability. Therefore, $(K_n - b_n)/a_n \xrightarrow{d} X$. Using Proposition 5.2 and the
Markov inequality, we conclude that $(K_n - K_{n,1} - b_n)/a_n \xrightarrow{d} X$.
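
The passage from $K_n^*$ to $K_n$ is a Slutsky-type step, and it may help to see it written out. The sketch below assumes, as the appeal to the Markov inequality suggests, that Theorem 2.2(a) yields a uniform bound on $\mathbb{E}K_{n,0}$; the constant $C$ and the threshold $\delta$ are notation introduced here only for illustration.

```latex
% Assumption (not stated explicitly in this section):
%   C := \sup_n \mathbb{E} K_{n,0} < \infty .
\[
  \frac{K_n - b_n}{a_n}
  \;=\; \frac{K_n^* - b_n}{a_n} \;-\; \frac{K_{n,0}}{a_n},
  \qquad
  \mathbb{P}\Bigl\{\frac{K_{n,0}}{a_n} > \delta\Bigr\}
  \;\le\; \frac{\mathbb{E} K_{n,0}}{\delta\, a_n}
  \;\le\; \frac{C}{\delta\, a_n} \;\longrightarrow\; 0
  \quad (a_n \to \infty),
\]
for every fixed $\delta > 0$.
```

Since the first term on the right converges in distribution to $X$ and the second vanishes in probability, the difference has the same limit $X$.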

To prove the result for $W_n$, consider (13) and assume that there exists an
increasing sequence $\{n_k : k \in \mathbb{N}\}$ such that $V_{n_k} \to \infty$. According to Theorem

2.2, we get from (13) K^ofi = Koo^ + 1, which is absurd. Thus, the sequence 
$\{V_n : n \in \mathbb{N}\}$ is bounded in probability, which implies that

(27) {Uv„ - 1(K = 0) : n G N} is bounded in probability, 

as well. An appeal to (14) allows us to conclude that $(W_n - b_n)/a_n \xrightarrow{d} X$.

Assume now that either $(K_n - b_n)/a_n$, $(K_n - K_{n,1} - b_n)/a_n$, or
$(W_n - b_n)/a_n$ converges in distribution to a random variable $X$ with some proper
and nondegenerate probability distribution. Essentially in the same way as
for $K_n^*$ [but now using also either the result of Theorem 2.2 or Proposition
5.2 or (27)], we can prove that $a_n \to \infty$, and the same argument as above
proves that $(K_n^* - b_n)/a_n \xrightarrow{d} X$. The proof is complete.
7. Proof of Theorem 2.4.

Case $\mu < \infty$. With $g(n,m)$ defined on page 10, we have
$\mathbb{P}\{Z_n = m\} = g(n,m)\,\mathbb{P}\{A_m = 0\}$.
Since, according to (8), 

an appeal to (16) completes the proof of this case. 

Case $\mu = \infty$. Denote by $U(z) := \inf\{z - S_n : S_n < z,\ n \in \mathbb{N}_0\}$ the undershoot
at $z > 0$. For $k \in \{1, 2, \ldots, n\}$ we have

(28) $\mathbb{P}\{Z_n > k\} = \mathbb{P}\{U(E_{n,n}) > E_{n,n} - E_{n-k,n}\}$.
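
Relation (28) is a pathwise statement, so it can be illustrated on a simulated configuration. The following is a minimal sketch, not the authors' construction: the function names are hypothetical, boxes are taken to be the intervals between consecutive renewal epochs, and the steps are assumed to be exponential with rate `theta` (the step law that arises when $\xi$ is beta$(1,\theta)$). It counts the balls in the last occupied box once directly and once through the undershoot at the largest point.

```python
import bisect
import random


def sample_configuration(n, theta, rng):
    """Draw n standard exponential points and a random walk that passes them.

    Assumption for illustration only: steps are Exp(theta).
    """
    points = [rng.expovariate(1.0) for _ in range(n)]
    top = max(points)
    renewals = [0.0]  # S_0 = 0
    while renewals[-1] < top:
        renewals.append(renewals[-1] + rng.expovariate(theta))
    return points, renewals


def last_renewal_before(renewals, z):
    """Largest S_k with S_k < z; renewals must be sorted and start at 0, z > 0."""
    return renewals[bisect.bisect_left(renewals, z) - 1]


def last_box_count(points, renewals):
    """Z_n counted directly: points above the last renewal below the maximum."""
    s_last = last_renewal_before(renewals, max(points))
    return sum(1 for p in points if p > s_last)


def last_box_count_via_undershoot(points, renewals):
    """Z_n counted as in (28): points within the undershoot of the maximum."""
    top = max(points)
    u = top - last_renewal_before(renewals, top)  # U(E_{n,n})
    return sum(1 for p in points if top - p < u)
```

For example, with points `[0.5, 1.2, 3.0, 3.4]` and renewal epochs `[0.0, 1.0, 2.5, 5.0]`, the last occupied box is the interval above 2.5, and both functions count the two points 3.0 and 3.4.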

Assume first that $\alpha \in [0,1)$ and for fixed $\varepsilon \in (0,1)$ set $k_n := [n^{\varepsilon}]$. Since
$E_{n,n}$ is independent of the undershoot and tends to $+\infty$ in probability, an
appeal to [6], Theorem 8.6.3, allows us to conclude that

$$\frac{U(E_{n,n})}{E_{n,n}} \xrightarrow{d} Z_\alpha,$$

where the distribution of $Z_0$ is $\delta_1$, degenerate at point 1, and for $\alpha \in (0,1)$,
$Z_\alpha$ has the beta$(1-\alpha, \alpha)$ distribution. Using the convergence of $E_{n,n} - \log n$,
we obtain $E_{n,n}/\log n \to 1$ in probability. Since, for $x > 0$,



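$$\mathbb{P}\{E_{n,n} - E_{n-k_n,n} \le x\} = (1 - e^{-x})^{k_n},$$

we can easily check that $(E_{n,n} - E_{n-k_n,n})/\log n \xrightarrow{P} \varepsilon$. Therefore,

$$\frac{E_{n,n} - E_{n-k_n,n}}{E_{n,n}} \xrightarrow{d} \varepsilon.$$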



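Now the result follows from the relation

$$\mathbb{P}\left\{\frac{\log Z_n}{\log n} > \varepsilon\right\} = \mathbb{P}\left\{\frac{U(E_{n,n})}{E_{n,n}} > \frac{E_{n,n} - E_{n-k_n,n}}{E_{n,n}}\right\}.$$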

Indeed, while in the case $\alpha \in (0,1)$, each $\varepsilon \in (0,1)$ is a continuity point of the
distribution of $Z_\alpha$, in the case $\alpha = 0$ the relation establishes the convergence
in probability $\log Z_n/\log n \to 1$ (notice that $\log Z_n/\log n \le 1$ a.s.).
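
The distribution function of the gap between the largest point and the point $k_n$ places below it, used in the argument above, can be recovered from the standard Rényi representation of exponential order statistics. The following is a sketch; the i.i.d. standard exponential variables $\tau_1, \tau_2, \ldots$ are auxiliary notation introduced here and do not appear in the original text.

```latex
% \tau_1, \tau_2, \dots : i.i.d. standard exponential (auxiliary notation).
\[
  (E_{1,n},\dots,E_{n,n}) \overset{d}{=}
  \Bigl(\sum_{j=1}^{i}\frac{\tau_j}{n-j+1}\Bigr)_{i=1,\dots,n}
  \;\Longrightarrow\;
  E_{n,n}-E_{n-k,n} \overset{d}{=} \sum_{j=n-k+1}^{n}\frac{\tau_j}{n-j+1}
  \overset{d}{=} \sum_{l=1}^{k}\frac{\tau_l}{l} \overset{d}{=} E_{k,k},
\]
\[
  \mathbb{P}\{E_{k,k}\le x\} = (1-e^{-x})^{k}, \qquad x>0 .
\]
```

With $k = k_n = [n^{\varepsilon}]$ and $x = c\log n$, the right-hand side behaves like $\exp(-n^{\varepsilon - c})$, which tends to $0$ for $c < \varepsilon$ and to $1$ for $c > \varepsilon$; this is the convergence in probability of the normalized gap to $\varepsilon$.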

Consider now the remaining case $\alpha = 1$. For fixed $\varepsilon \in (0,1)$ set $k_n :=
\lfloor \exp(m^{-1}(\varepsilon m(\log n))) \rfloor$, where $m^{-1}(\cdot)$ is the increasing and continuous
inverse of $m(x) = \int_0^x \mathbb{P}\{-\log\bar{\xi} > y\}\,dy$, $x > 0$. Using again the independence
of $E_{n,n}$ and the undershoot and exploiting [7], Theorem 6, leads to the
conclusion

$$\frac{m(U(E_{n,n}))}{m(E_{n,n})} \xrightarrow{d} Z,$$
where the law of $Z$ is uniform$[0,1]$. Fix any $i \in \mathbb{N}$. It is well known that $m(x)$
is slowly varying at $\infty$. Therefore, $m^i(\log x)$ is also slowly varying at $\infty$, and
as $s \downarrow 0$, $m^i(-\log(1 - e^{-s})) \sim m^i(-\log s)$, where $f \sim g$ means that the ratio
$f/g$ goes to one. Applying Proposition 1.5.8 and Theorem 1.7.1' from [6] to
the equality

$$\mathbb{E}[m^i(E_{n,n})] = n\int_0^\infty m^i(-\log(1 - e^{-s}))\,e^{-ns}\,ds$$

we get $\mathbb{E}[m^i(E_{n,n})] \sim m^i(\log n)$. Similarly,

$$\mathbb{E}[m^i(E_{n,n} - E_{n-k_n,n})] \sim m^i(\log\exp(m^{-1}(\varepsilon m(\log n)))) = \varepsilon^i m^i(\log n).$$
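
The integral representation of the expectation of $m^i$ at the largest point follows from a change of variables in the distribution function of the maximum of $n$ standard exponentials. A short derivation is sketched here for a generic nonnegative measurable function $h$, which is notation introduced only for this illustration; the case of interest is $h = m^i$.

```latex
% h : generic nonnegative measurable function (illustrative notation).
\[
  \mathbb{P}\{E_{n,n}\le x\} = (1-e^{-x})^{n}
  \quad\Longrightarrow\quad
  \mathbb{E}\,h(E_{n,n}) = \int_0^\infty h(x)\, d\bigl[(1-e^{-x})^{n}\bigr].
\]
% Substitute e^{-s} = 1 - e^{-x}, i.e. x = -\log(1-e^{-s}),
% so that (1-e^{-x})^n = e^{-ns}; x from 0 to infinity
% corresponds to s from infinity down to 0.
\[
  \mathbb{E}\,h(E_{n,n})
  = \int_{\infty}^{0} h\bigl(-\log(1-e^{-s})\bigr)\, d\bigl[e^{-ns}\bigr]
  = n\int_0^\infty h\bigl(-\log(1-e^{-s})\bigr)\, e^{-ns}\, ds .
\]
```

The right-hand side is $n$ times a Laplace transform in $s$ evaluated at $n$, which is the form to which Tauberian-type results for slowly varying functions apply.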

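The last two relations (with $i = 1$ and $i = 2$) together with Chebyshev's
inequality imply that

$$\frac{m(E_{n,n})}{m(\log n)} \xrightarrow{P} 1 \quad\text{and}\quad \frac{m(E_{n,n} - E_{n-k_n,n})}{m(\log n)} \xrightarrow{P} \varepsilon.$$

Consequently, $m(E_{n,n} - E_{n-k_n,n})/m(E_{n,n}) \xrightarrow{P} \varepsilon$. To finish the proof, it remains
to note that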



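$$\mathbb{P}\left\{\frac{m(\log Z_n)}{m(\log n)} > \varepsilon\right\} \overset{(28)}{=} \mathbb{P}\left\{\frac{m(U(E_{n,n}))}{m(E_{n,n})} > \frac{m(E_{n,n} - E_{n-k_n,n})}{m(E_{n,n})}\right\} \to \mathbb{P}\{Z > \varepsilon\}.$$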



APPENDIX 

Asymptotic behavior of the first passage time processes $N_t = \inf\{k \ge
1 : S_k > t\}$, for $S_k = X_1 + \cdots + X_k$ a zero-delayed random walk with positive
steps, was investigated by many authors (see, e.g., [5, 8, 18, 19]). The next
proposition is a summary of results scattered in the literature.

Proposition A.1. Assume that $X_1 > 0$ a.s. and that the distribution
of $X_1$ is nonlattice. The following assertions are equivalent:

(i) There exist functions $a(t) > 0$, $b(t) \in \mathbb{R}$ such that, as $t \to \infty$, $(N_t -
b(t))/a(t)$ converges weakly to a nondegenerate and proper probability law.




(ii) Either the distribution of $X_1$ belongs to the domain of attraction of
a stable law, or $\mathbb{P}\{X_1 > x\}$ slowly varies at $\infty$.

Set $\mu = \mathbb{E}X_1$ and $\sigma^2 = \mathrm{Var}\,X_1$.

(a) If $\sigma^2 < \infty$, then, with $b(t) = \mu^{-1}t$ and $a(t) = (\mu^{-3}\sigma^2 t)^{1/2}$, the limiting
law is standard normal.

(b) If $\sigma^2 = \infty$ and

$$\int_0^x y^2\,\mathbb{P}\{X_1 \in dy\} \sim L(x) \quad\text{as } x \to \infty,$$

for some $L$ slowly varying at $\infty$, then, with $b(t) = \mu^{-1}t$ and $a(t) =
\mu^{-3/2}c(t)$, where $c(t)$ is any function satisfying

$$\lim_{t\to\infty} tL(c(t))/c^2(t) = 1,$$

the limiting law is standard normal.

(c) Assume that the relation

(29) $\mathbb{P}\{X_1 > x\} \sim x^{-\alpha}L(x)$ as $x \to \infty$,

where $L$ is some function slowly varying at $\infty$, holds with $\alpha \in [1,2)$,
and that in the case $\alpha = 1$ also $\mu < \infty$. Then, with $b(t) = \mu^{-1}t$ and
$a(t) = \mu^{-(\alpha+1)/\alpha}c(t)$, where $c(t)$ is any function satisfying

$$\lim_{t\to\infty} tL(c(t))/c^{\alpha}(t) = 1,$$

the limiting law is $\alpha$-stable with characteristic function

$$t \mapsto \exp\{-|t|^{\alpha}\,\Gamma(1-\alpha)(\cos(\pi\alpha/2) + i\sin(\pi\alpha/2)\,\mathrm{sgn}(t))\}, \quad t \in \mathbb{R}.$$

(d) Assume that $\mu = \infty$ and the relation (29) holds with $\alpha = 1$. Let $c$ be any
positive function satisfying $\lim_{x\to\infty} xL(c(x))/c(x) = 1$ and set $\psi(x) :=$
X /cxp(-c(x)) ^{^1 — y}y^^ ^y- ^(*) be any positive function satisfying 
$b(\psi(t)) \sim \psi(b(t)) \sim t$. Then, with $a(t) = b(t)c(b(t))/t$, the limiting law is
1-stable with characteristic function

$$t \mapsto \exp\{-|t|(\pi/2 - i\log|t|\,\mathrm{sgn}(t))\}, \quad t \in \mathbb{R}.$$

(e) If the relation (29) holds with $\alpha \in [0,1)$, then, with $b(t) = 0$ and $a(t) =
t^{\alpha}/L(t)$, the limiting law $\theta_\alpha$ is a scaled Mittag-Leffler law (exponential, if
$\alpha = 0$) with moments
All the above statements remain valid if the continuous variable $t$ is replaced
by the discrete variable $\log n$, $n \in \mathbb{N}$, as has been used in this paper.
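
Part (a) of the proposition can be explored numerically. The following is a minimal sketch under stated assumptions, not part of the original argument: the function names are hypothetical, and the steps are taken to be standard exponential, which is a nonlattice law with $\mu = 1$ and $\sigma^2 = 1$. It computes $N_t$ for a given step sequence and applies the centering $b(t) = \mu^{-1}t$ and scaling $a(t) = (\mu^{-3}\sigma^2 t)^{1/2}$.

```python
import math
import random


def first_passage_index(steps, t):
    """N_t = inf{k >= 1 : S_k > t} for partial sums S_k of the given steps."""
    total, k = 0.0, 0
    for x in steps:
        total += x
        k += 1
        if total > t:
            return k
    raise ValueError("step sequence exhausted before crossing level t")


def normalized_case_a(n_t, t, mu, sigma2):
    """(N_t - b(t)) / a(t) with b(t) = t/mu and a(t) = sqrt(sigma2 * t / mu^3)."""
    b = t / mu
    a = math.sqrt(sigma2 * t / mu ** 3)
    return (n_t - b) / a


def simulate(t, reps, seed=0):
    """Normalized first passage times for Exp(1) steps (assumed mu = sigma2 = 1)."""
    rng = random.Random(seed)

    def exp_steps():
        while True:
            yield rng.expovariate(1.0)

    return [
        normalized_case_a(first_passage_index(exp_steps(), t), t, 1.0, 1.0)
        for _ in range(reps)
    ]
```

For large `t`, a histogram of `simulate(t, reps)` should look approximately standard normal; the other parts of the proposition would need a heavy-tailed step sampler and the corresponding choice of $c(t)$.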

In the lattice case, when $X_1$ assumes only positive integer values, the
whole range of possible distributional limits follows from Theorems 1.2, 1.5
and Proposition 3.1 in [21]. Although the first two of these results were
formulated for other variables, they apply to $N_t$ as well. As in [21], the same
asymptotic results are readily extendible to the first passage time processes
for random walks with positive nonlattice steps.